A general structural equation model is fitted on a panel data set
that consists of I correlated samples. The correlated samples could 
be data from correlated populations or correlated observations from 
occasions of panel data. We consider cases in which the full pseudo- 
normal likelihood cannot be used, for example, in highly unbalanced 
data where the participating individuals do not appear in consecu- 
tive years. The model is estimated by a partial likelihood that would 
be the full and correct likelihood for independent and normal sam- 
ples. It is proved that the asymptotic standard errors (a.s.e.'s) for 
the most important parameters and an overall-fit measure are the 
same as the corresponding ones derived under the standard assump- 
tions of normality and independence for all the observations. These 
results are very important since they allow us to apply classical sta- 
tistical methods for inference, which use only first- and second-order 
moments, to correlated and nonnormal data. Via a simulation study 
we show that the a.s.e.'s based on the first two moments have negli- 
gible bias and provide less variability than the a.s.e.'s computed by 
an alternative robust estimator that utilizes up to fourth moments. 
Our methodology and results are applied to real panel data, and it 
is shown that the correlated samples cannot be formulated and ana- 
lyzed as independent samples. We also provide robust a.s.e.'s for the 
remaining parameters. Additionally, we show in the simulation that 
the efficiency loss for not considering the correlation over the samples 
is small and negligible in the cases with random and fixed variables. 
1. Introduction. Latent variable analysis has been used widely in the
social and behavioral sciences as well as in economics, and its use in medical 
and business applications is becoming popular. Path analysis, confirmatory 
factor analysis and latent variable models are the most popular psychometric 
models, and are all special cases of structural equation modeling (SEM). Ad- 
ditionally, in econometrics special cases of structural equation modeling are 
simultaneous equations, errors-in-variables models and dynamic panel data 
with random effects. In latent variable models, underlying subject-matter 
concepts are represented by unobservable latent variables, and their rela- 
tionships with each other and with the observed variables are specified. The 
models that express observed variables as a linear function of latent vari- 
ables are extensively used, because of their simple interpretation and the 
existence of computer packages such as EQS [9], LISREL [18] and PROC 
CALIS (SAS Institute [27]). The standard procedures in the existing com- 
puter packages assume that all the variables are normally distributed. The 
normality and linearity assumptions make the analysis and the interpreta- 
tion simple, but their applicability in practice is often questionable. In fact, it 
is rather common in many applications to use the normality-based standard 
errors and model-fit test procedures when observed variables are highly dis- 
crete, bounded, skewed or generally nonnormal. Thus, it is of practical and 
theoretical interest to examine the extent of the validity of the normality- 
based inference procedures for nonnormal data and to explore possible ways 
to parameterize and formulate a model to attain wide applicability. In the 
structural equation analysis literature, this type of research is often referred 
to as asymptotic robustness study. Most existing results on this topic have 
been for a single sample from one population. This paper addresses the prob- 
lem for multiple samples or multiple populations, and provides a unified and 
comprehensive treatment of the so-called asymptotic robustness. The em- 
phasis here is the suggestion that proper parameterization and modeling 
lead to practical usefulness and to a meaningful interpretation. It is the first 
study that shows robust asymptotic standard errors (a.s.e.'s) and overall-fit 
measures for correlated samples with fixed factors for models with latent 
variables. Novel formulas are provided for the computation of the a.s.e.'s for 
the means and variances of the fixed correlated factors. Also, in the case 
of random correlated factors we prove that the a.s.e.'s of the means for the 
factors are robust. The superiority of the suggested a.s.e.'s to the existing 
robust a.s.e.'s that involve the computation of third and fourth moments is 
shown numerically. In a simulation study, the proposed a.s.e.'s are shown 
to have less variability than the robust a.s.e.'s computed by the so-called 
sandwich estimator. Also, the simulation studies were conducted to verify 
the theoretical results, assess the use of asymptotic results in finite samples, 
show the robustness of the power for tests and demonstrate the efficiency of 
the method relative to the full-likelihood estimation method that includes 
all the covariances of the variables over populations. The proposed method
can be applied to all correlated data that can be grouped as a few corre- 
lated samples. In these correlated samples the observations are independent; 
for example, in panel data the correlated samples could be the occasions. 
The proposed methodology models variables within the samples and it can 
ignore the modeling of the variables between the correlated samples when 
it is impossible, for example, in highly unbalanced panel data in which the 
participating individuals do not appear in consecutive years. An application 
with real panel data from the Greek banking sector illustrates the impor- 
tance of the proposed methodology and the derived theoretical results. In 
this example, it is shown that the correlated samples cannot be formulated 
and analyzed as independent samples. 

A general latent variable model for a multivariate observation vector $\nu_j^{(i)}$ with dimension $p^{(i)} \times 1$ that is an extension of the models considered by Anderson [3, 4], Browne and Shapiro [14] and Satorra [28, 29, 30, 31, 32, 33] is

(1)  $\nu_j^{(i)} = \beta^{(i)} + B^{(i)} (\zeta_j^{(i)\prime}, \varepsilon_j^{(i)\prime})'$,  $i = 1, \ldots, I$,  $j = 1, \ldots, n^{(i)}$,

under the following set of assumptions. The model is extended with fixed and correlated-over-populations latent variables.

Assumption 1.
(i) There are two cases:

Case A: The variable $\zeta_j^{(i)}$ is (a) random with mean vector $\mu_{\zeta^{(i)}}$ and covariance matrix $\Sigma_{\zeta^{(i)}}$, (b) correlated over $i$ (i.e., the measurements of the $j$th individual of the $i_1$th population are correlated with the corresponding measurements of the $j$th individual of the $i_2$th population, for $j \le \min\{n^{(i_1)}, n^{(i_2)}\}$) and (c) independent over $j$ (for each population the measurements of the observed individuals are independent).

Case B: The variable $\zeta_j^{(i)}$ is (a) fixed with limiting mean vector $\mu_{\zeta^{(i)}} = \lim_{n^{(i)} \to \infty} \bar{\zeta}^{(i)}$ and limiting covariance matrix $\Sigma_{\zeta^{(i)}} = \lim_{n^{(i)} \to \infty} S_{\zeta^{(i)}}$ and (b) correlated over $i$ [see comments in case A(b)].

(ii) There exists $\varepsilon_j^{(i)} = (\varepsilon_{0j}^{(i)\prime}, \varepsilon_{1j}^{(i)\prime}, \ldots, \varepsilon_{L^{(i)}j}^{(i)\prime})'$, where (a) $\varepsilon_{0j}^{(i)} \sim N(0, \Sigma_{\varepsilon_0^{(i)}})$, [...] over $i$ and $j$.
(iii) The intercepts $\beta^{(i)}$, the coefficients $B^{(i)}$ and the variance matrices $\Sigma_{\varepsilon_0^{(i)}}$ of the normally distributed errors can be restricted. Thus, they are assumed to be functions of a vector $\tau$.

(iv) The mean vectors $\mu_{\zeta^{(i)}}$, the variance matrices $\Sigma_{\zeta^{(i)}}$ of the correlated factors and the variance matrices of the nonnormal vectors $\Sigma_{\varepsilon_\ell^{(i)}}$ ($\ell = 1, \ldots, L^{(i)}$) are assumed to be unrestricted.

A common approach to verifying the identification and fitting the model is to assume hypothetically that all $\zeta_j^{(i)}$'s are normally distributed and to concentrate on the first two moments of the observed vector $\nu_j^{(i)}$. The issue for the so-called asymptotic robustness study is to assess the validity of such procedures based on the assumed normality, in terms of inference for unknown parameters, for a wide class of distributional assumptions on $\zeta_j^{(i)}$. It turns out that the type of parameterization used in the model, restricting the coefficient $B^{(i)}(\tau)$ but keeping the variances $\Sigma_{\varepsilon_\ell^{(i)}}$ of the nonnormal latent variables $\varepsilon_{\ell j}^{(i)}$ unrestricted, plays a key role in the study.

The model, the notation and the assumptions are explained by the fol- 
lowing example. 

Example 1. A two-population ($I = 2$) recursive system of simultaneous equations with errors in the explanatory variables is considered. The model is shown in (2). The system in (2) can be written in the matrix form $\nu_j^{(i)} = \alpha^{(i)} + \Gamma^{(i)} \nu_j^{(i)} + \Lambda^{(i)} \zeta_j^{(i)} + e_j^{(i)}$, which has the form of model (1) with $\beta^{(i)} = (I^{(i)} - \Gamma^{(i)})^{-1} \alpha^{(i)}$, $B^{(i)} = (I^{(i)} - \Gamma^{(i)})^{-1} [\Lambda^{(i)}, I^{(i)}]$ and $\varepsilon_j^{(i)} = e_j^{(i)}$. The model is also a special case of the LISREL model with no latent variables in the dependent variables $y^{(i)}$, that is, $y^{(i)} = \eta^{(i)}$ in the LISREL notation. The latent variables $\zeta_j^{(1)}$ and $\zeta_j^{(2)}$ are correlated for each $j = 1, \ldots, 500$, with
correlation 0.4. That is, the measurements of each individual from the second 
population are correlated with the measurements of one individual from 
the first population. The first population also has 500 individuals that are 
independent from all the individuals of the second population. Note that 
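the number of observed variables is different for the two populations. Four measurements, $x_j^{(1)}$, $y_{1j}^{(1)}$, $y_{2j}^{(1)}$ and $y_{3j}^{(1)}$, are taken from the first population ($p^{(1)} = 4$) and three measurements, $x_j^{(2)}$, $y_{1j}^{(2)}$ and $y_{2j}^{(2)}$, are taken from the second ($p^{(2)} = 3$). For $j = 1, \ldots, n^{(i)}$, with $n^{(1)} = 1000$ and $n^{(2)} = 500$,

(2)
$x_j^{(1)} = \zeta_j^{(1)} + e_{0j}^{(1)}$,   $x_j^{(2)} = \zeta_j^{(2)} + e_{0j}^{(2)}$,
$y_{1j}^{(1)} = \beta_1 + \delta_1 \zeta_j^{(1)} + e_{1j}^{(1)}$,   $y_{1j}^{(2)} = \beta_1 + \delta_1 \zeta_j^{(2)} + e_{1j}^{(2)}$,
$y_{2j}^{(1)} = \beta_2 + \gamma_1 y_{1j}^{(1)} + \delta_2 \zeta_j^{(1)} + e_{2j}^{(1)}$,   $y_{2j}^{(2)} = \beta_2 + \gamma_1 y_{1j}^{(2)} + \delta_2 \zeta_j^{(2)} + e_{2j}^{(2)}$,
$y_{3j}^{(1)} = \beta_3 + \gamma_2 y_{2j}^{(1)} + e_{3j}^{(1)}$.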

The parameters $\beta_1$, $\beta_2$, $\gamma_1$, $\delta_1$ and $\delta_2$ do not depend on $i$. That is, they are common for the two populations. These parameters belong to the vector $\tau$. The variables $\zeta_j^{(1)}$ and $\zeta_j^{(2)}$ can be fixed or nonnormal according to cases A and B of Assumption 1. If all the errors are normal, then in accordance with the notation of Assumption 1 we have $\varepsilon_{0j}^{(i)} = e_j^{(i)}$, while if $e_{0j}^{(i)}$ is normal and all the other errors are nonnormal, then $\varepsilon_{0j}^{(i)} = e_{0j}^{(i)}$ and $\varepsilon_{\ell j}^{(i)} = e_{\ell j}^{(i)}$ for $i = 1, 2$, $j = 1, \ldots, n^{(i)}$ and $\ell = 1, \ldots, L^{(i)}$, with $L^{(1)} = 3$ and $L^{(2)} = 2$. According to Assumption 1, only the variances of the normal errors can be restricted to be the same over populations and these variances belong to the vector $\tau$.

Further discussion about the model in (1) is given in Section 2. The model 
in (2) of Example 1 is simulated in Section 4 and used as an example to 
explain the theory in this paper. 
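The data-generating mechanism of Example 1 can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: it assumes the recursive form of (2) with the variables ordered $(x, y_1, y_2, y_3)$, uses hypothetical parameter values and normal errors, and relies on the reduced form $\beta^{(i)} = (I - \Gamma^{(i)})^{-1}\alpha^{(i)}$, $B^{(i)} = (I - \Gamma^{(i)})^{-1}[\Lambda^{(i)}, I]$ stated in the example. The function names `reduced_form` and `simulate_example` are illustrative only.

```python
import numpy as np


def reduced_form(alpha, Gamma, Lambda):
    """Map nu = alpha + Gamma nu + Lambda zeta + e to nu = beta + B (zeta', e')'."""
    p = Gamma.shape[0]
    inv = np.linalg.inv(np.eye(p) - Gamma)
    beta = inv @ alpha
    B = inv @ np.hstack([Lambda, np.eye(p)])
    return beta, B


def simulate_example(n1=1000, n2=500, rho=0.4, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical parameter values (assumptions, not taken from the paper).
    b1, b2, b3, g1, g2, d1, d2 = 1.0, 0.5, -0.5, 0.8, 0.6, 1.0, 0.7

    # Latent factors: individuals j = 1..n2 are paired across the two
    # populations with correlation rho; the other n1 - n2 are unpaired.
    pair_cov = np.array([[1.0, rho], [rho, 1.0]])
    paired = rng.multivariate_normal(np.zeros(2), pair_cov, size=n2)
    zeta1 = np.concatenate([paired[:, 0], rng.standard_normal(n1 - n2)])
    zeta2 = paired[:, 1]

    # Population 1, variables ordered (x, y1, y2, y3).
    alpha1 = np.array([0.0, b1, b2, b3])
    Gamma1 = np.zeros((4, 4))
    Gamma1[2, 1], Gamma1[3, 2] = g1, g2
    Lambda1 = np.array([[1.0], [d1], [d2], [0.0]])

    # Population 2, variables ordered (x, y1, y2).
    alpha2 = np.array([0.0, b1, b2])
    Gamma2 = np.zeros((3, 3))
    Gamma2[2, 1] = g1
    Lambda2 = np.array([[1.0], [d1], [d2]])

    out = {}
    for key, alpha, Gamma, Lambda, zeta in (
        (1, alpha1, Gamma1, Lambda1, zeta1),
        (2, alpha2, Gamma2, Lambda2, zeta2),
    ):
        beta, B = reduced_form(alpha, Gamma, Lambda)
        e = 0.5 * rng.standard_normal((zeta.size, alpha.size))
        latent = np.hstack([zeta[:, None], e])  # rows are (zeta_j, e_j')
        out[key] = {"nu": beta + latent @ B.T, "zeta": zeta, "e": e}
    return out


data = simulate_example()
```

Replacing the normal draws for the errors or the factors by skewed or fixed values would give the nonnormal and fixed-factor settings of Assumption 1.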

Latent variable analysis of multiple populations was discussed by 
Jöreskog [17], Lee and Tsui [20], Muthén [23] and Satorra [29, 30]. The
so-called asymptotic robustness of normal-based methods for latent variable 
analysis has been extensively studied in the last 15 years. For exploratory 
(unrestricted) factor analysis, Amemiya, Fuller and Pantula [2] proved that 
the limiting distribution of some estimators is the same for fixed, nonnor- 
mal and normal factors under the assumption that the errors are normally 
distributed. Browne [12] showed that the above results hold for a more gen- 
eral class of latent variable models assuming finite eighth moments for the 
factors and normal errors. Anderson and Amemiya [5], and Amemiya and 
Anderson [1] extended the above results to confirmatory factor analysis and 
nonnormal errors; they assumed finite second moments for the factors and 
errors. Browne and Shapiro [14] introduced a general linear model and used 
an approach based on the finite fourth moments that differs from that of 
Anderson and Amemiya. Considering the model of Browne and Shapiro, An- 
derson [3, 4] included nonstochastic latent variables and assumed only finite 
second moments for the nonnormal latent variables. Latent variable mod- 
els with mean and covariance structures were studied by Browne [13] and 
Satorra [28]. Satorra [29, 30, 31, 32, 33] first considered asymptotic robust- 
ness for linear latent models in multisample analysis of augmented-moment 
structures. Additional studies on the asymptotic robustness of latent vari- 
able analysis were conducted by Shapiro [37], Mooijaart and Bentler [22] 
and Satorra and Bentler [35]. 

For the one-sample problem, asymptotic distribution-free (ADF) meth- 
ods for latent variable analysis were proposed to deal with nonnormal data 
(see, e.g., [8, 11, 23]). The ADF methods turned out to be problematic in
practice, since the fourth-order sample moments are very variable (see, e.g., 
[15, 24]). In this paper mean and covariance structures are considered for a 
general multipopulation model that contains fixed, normal and nonnormal 
variables; some of the nonnormal variables are allowed to be correlated over 
populations. We use the approach of Anderson and Amemiya [5] to show 
that the normal-based methods are applicable for nonnormal and nonran- 
dom data assuming finite second-order moments. We also use extensively 
theory and notation from matrix analysis (see, e.g., [16, 21]). 

Section 2 explains the suggested parameterization and the estimation 
procedure. The theoretical results are derived and discussed in Section 3. 
Section 4 reports results from simulation studies and shows that the proposed asymptotic standard errors seem to be numerically more efficient than those derived by the sandwich estimator. Our methodology and the theoretical
results are applied and explained in Section 5 by the fit of an econometric 
model with latent economic factors to real data. 

2. Model, parameterization and procedure. In this paper we study the model (1) introduced in Section 1. We consider $I$ populations and we assume that $n^{(i)}$ individuals are sampled from the $i$th population, $i = 1, \ldots, I$, and that $p^{(i)}$ measurements are taken from each sampled individual in the $i$th population. Denote the multisample data set by $\nu_j^{(i)}$, $i = 1, \ldots, I$, $j = 1, \ldots, n^{(i)}$, where $\nu_j^{(i)}$ is the $p^{(i)} \times 1$ measurement vector from the $j$th individual in the $i$th population. We consider a very general latent variable model that includes models widely used in single population cases and covers a large class of distributional situations in one form. To cover various distributional settings, it is convenient to assume that the observed vector $\nu_j^{(i)}$ can be written as a linear combination of $L^{(i)} + 2$ independent latent vectors and that the latent vectors can be divided into three groups: (1) a fixed or nonnormal vector $\zeta_j^{(i)}$ that is correlated over populations, (2) a random vector $\varepsilon_{0j}^{(i)}$ assumed to be normally distributed and (3) nonnormal vectors $\varepsilon_{\ell j}^{(i)}$ ($\ell = 1, \ldots, L^{(i)}$). Note that the sample size $n^{(i)}$, the number of measured variables $p^{(i)}$ and the number of latent vectors $L^{(i)}$ generally differ over populations (dependent on $i$). The generality of this model allows us to deal with cases where slightly different variables are measured from different populations with possibly different structures.

All normally distributed latent variables are included in $\varepsilon_{0j}^{(i)}$ and their distribution may possibly be related through $\tau$ over populations $i = 1, \ldots, I$. Other unspecified or nonnormal random latent variables are divided into independent parts $\varepsilon_{\ell j}^{(i)}$, $\ell = 1, \ldots, L^{(i)}$, with unrestricted covariance matrices. Case A of Assumption 1 with fixed $\zeta_j^{(i)}$ can represent a situation where the interest is in the model fitting and estimation only for a given set of individuals and not for the populations. In addition, the fixed $\zeta_j^{(i)}$ can be used in an analysis conducted conditionally on a given set of $\zeta_j^{(i)}$ values. Such a conditional analysis may be appropriate when the individuals $j = 1, \ldots, n^{(i)}$ do not form a random sample from the $i$th population and/or when a component of $\nu_j^{(i)}$ represents some dependency over $I$ populations. For example, the $I$ populations may actually correspond to a single population at $I$ different time points. With $\zeta_j^{(i)}$ being latent and fixed, the limits of the unobservable sample mean, $\mu_{\zeta^{(i)}}$, and of the sample covariance matrix, $\Sigma_{\zeta^{(i)}}$, are assumed to be unknown and unrestricted. All $\beta^{(i)}(\tau)$ and $B^{(i)}(\tau)$ are expressed in terms of $\tau$, which represents known or restricted elements and allows functional relationships over $I$ populations. Even though $\tau$ also appears in $\Sigma_{\varepsilon_0^{(i)}}(\tau)$, the elements of $\tau$ are usually divided into two groups: one for $\Sigma_{\varepsilon_0^{(i)}}(\tau)$ and another for $\beta^{(i)}(\tau)$ and $B^{(i)}(\tau)$.

Assumption 1(iii) and (iv) provide a particular identifiable parameterization for the model in (1). For the single population case with $I = 1$, various equivalent parameterizations have been used in practice. Some place restrictions on covariance matrices (e.g., by standardizing latent variables) and leave the coefficients unrestricted. The parameterization that leaves the covariance matrices (and possibly some mean vectors) of latent variables unrestricted and that places identification restrictions only on the coefficients and intercepts is referred to as the errors-in-variables parameterization. For the single population case, a parameterization with restricted covariance matrices generally has an equivalent errors-in-variables parameterization, and the two parameterizations with one-to-one correspondence lead to an equivalent interpretation. The one-sample asymptotic robustness results have shown that the asymptotic standard errors for the parameters in the errors-in-variables formulation computed under the normality assumption are valid for nonnormal data, but that the same does not hold under parameterization with restricted covariance matrices. For the multisample model in (1), we will show that the errors-in-variables type parameterization given in Assumption 1 provides asymptotic robustness. However, for the multisample case there are other reasons to consider the parameterization specified in Assumption 1(iii) and (iv). As mentioned earlier, a multipopulation study is conducted because the populations are thought to be different, but certain aspects of the structure generating data are believed to be common over populations. Suppose that the same or similar measurements are taken from different populations. For example, a similar set of psychological tests may be given to a number of different groups, such as two gender groups,
groups with different occupations or educational backgrounds, groups in
varying socioeconomic or cultural environments, or different time points in 
the growth of a group. The subject matter or scientific interest exists in mak- 
ing inferences about some general assertion that holds commonly for various 
populations. Such interest is usually expressed as relationships among latent 
(and observed) variables that hold regardless of the location and variability 
of the variables. Then a relevant analysis is to estimate and test the relation- 
ships, and to explore the range of populations for which the relationships 
hold. The parameterization in Assumption 1(iii) and (iv) with unrestricted $\Sigma_{\zeta^{(i)}}$ and generally structured $B^{(i)}(\tau)$ corresponds very well with the scientific interest of the study, and allows an interpretation consistent with the practical meaning of the problem. Note that $\Sigma_{\varepsilon_\ell^{(i)}}$, $i = 1, \ldots, I$, $\ell = 1, \ldots, L^{(i)}$, are unrestricted covariance matrices and do not have any relationships over $i$ or $\ell$, and that $\beta^{(i)}(\tau)$ and $B^{(i)}(\tau)$ can have known elements and elements with relationships over $i$ and $\ell$. On the other hand, the covariance matrix $\Sigma_{\varepsilon_0^{(i)}}(\tau)$ of the normal latent vector $\varepsilon_{0j}^{(i)}$ can have restrictions or equality over populations through $\tau$. This gives the generality of the model in (1) with only one normal latent vector, because a block diagonal $\Sigma_{\varepsilon_0^{(i)}}$ corresponds to a number of independent subvectors in the normal $\varepsilon_{0j}^{(i)}$. In addition, the possibility of restrictions on $\Sigma_{\varepsilon_0^{(i)}}(\tau)$ over populations can also be important in applications. For example, if the same measurement instruments are applied to different samples, then the variances of pure measurement errors may be assumed to be the same over the samples. However, the normality assumption for pure measurement errors is reasonable in most situations, and such errors can be included in $\varepsilon_{0j}^{(i)}$. Assumption 1(iii) and (iv) do not rule out latent variable variances and covariances with restrictions across populations, but do require the latent variables with restricted variances to be normally distributed. This requirement is not very restrictive in most applications, as discussed above, but it is needed to obtain the asymptotic robustness results given in the next section. The general form of $\beta^{(i)}(\tau)$ and inclusion of the fixed latent vector allow virtually any structure for the means of the observed $\nu_j^{(i)}$. Hence, the errors-in-variables type parameterization in Assumption 1(iii) can solve the identification problem, provide a general and convenient way to represent the subject-matter theory and concepts, and produce asymptotic robustness results presented in the next section.

For the multisample data $\nu_j^{(i)}$ in (1), let $\bar{\nu}^{(i)}$ and $S_{\nu^{(i)}}$ be the sample mean vector and sample covariance matrix (unbiased) for the $i$th population, $i = 1, \ldots, I$. It is assumed that the sample covariance matrices $S_{\nu^{(i)}}$ are nonsingular with probability 1. Define

(3)  $c = (c^{(1)\prime}, \ldots, c^{(I)\prime})'$, where $c^{(i)}$ consists of $\bar{\nu}^{(i)}$ and the elements of $S_{\nu^{(i)}}$.
We consider model fitting and estimation based only on $c$, because such procedures are simple and have some useful properties. Also note that Assumption 1 does not specify a particular distributional form of observations beyond the first two moments and specifies no particular correspondence or relationship between samples. Let $\theta$ be a $d_\theta \times 1$ vector containing all unknown parameters in $E(c) = \gamma(\theta)$ under the model in (1) and Assumption 1, and let $\theta = (\tau', \upsilon')'$, where $\tau$ and $\upsilon$ contain the parameters mentioned in Assumption 1(iii) and (iv), respectively. That is, $\tau$ contains parameters that can be restricted, while $\upsilon$ contains the parameters that cannot be restricted over populations. Under the model in (1) and Assumption 1, we compute the expected means

$\mu^{(i)}(\theta) = E(\bar{\nu}^{(i)})$ and $\Sigma^{(i)}(\theta) = E(S_{\nu^{(i)}})$.

For the estimation of $\theta$, we consider an estimator $\hat{\theta}$ obtained by minimizing over the parameter space

(4)  $Q(\theta) = \sum_{i=1}^{I} n^{(i)} \{ \mathrm{tr}[S_{\nu^{(i)}} \Sigma^{(i)-1}(\theta)] - \log|S_{\nu^{(i)}} \Sigma^{(i)-1}(\theta)| - p^{(i)} + [\bar{\nu}^{(i)} - \mu^{(i)}(\theta)]' \Sigma^{(i)-1}(\theta) [\bar{\nu}^{(i)} - \mu^{(i)}(\theta)] \}$.

The obtained estimator $\hat{\theta}$ is a slight modification of the normal maximum likelihood estimator (MLE). The exact normal MLE can be obtained if $[(n^{(i)} - 1)/n^{(i)}] S_{\nu^{(i)}}$ is used in place of $S_{\nu^{(i)}}$. Asymptotic results are equivalent for the two estimators. We consider $\hat{\theta}$ because it can be computed with existing computer packages. The form of $Q(\theta)$ corresponds to the so-called mean and covariance structure analysis, but the existing covariance structure computer packages without mean structure can be used to carry out the minimization of $Q(\theta)$ using a certain technique (see, e.g., the EQS and LISREL manuals). Note that other estimation techniques that are asymptotically equivalent to MLE can be used, such as minimum distance, which is actually a generalization of the generalized method of moments. In the next section, asymptotic distribution results for $\hat{\theta}$ are derived for a broad range of situations.
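The fitting function can be sketched in a few lines. This is a minimal illustration rather than the routine used by EQS or LISREL; it assumes the usual normal-theory form of (4) with both a covariance part and a mean part, and it takes the model-implied moments $\mu^{(i)}(\theta)$ and $\Sigma^{(i)}(\theta)$ as already evaluated inputs. The names `sample_moments` and `discrepancy` are illustrative.

```python
import numpy as np


def sample_moments(data):
    """Sample mean vector and unbiased sample covariance matrix (rows = individuals)."""
    return data.mean(axis=0), np.cov(data, rowvar=False)


def discrepancy(n, means, covs, mus, sigmas):
    """Normal-theory fitting function summed over populations.

    n      : sample sizes n^(i)
    means  : sample mean vectors
    covs   : unbiased sample covariance matrices
    mus    : model-implied mean vectors mu^(i)(theta)
    sigmas : model-implied covariance matrices Sigma^(i)(theta)
    """
    total = 0.0
    for n_i, m, S, mu, Sigma in zip(n, means, covs, mus, sigmas):
        p = S.shape[0]
        Sigma_inv = np.linalg.inv(Sigma)
        W = S @ Sigma_inv
        _, logdet = np.linalg.slogdet(W)  # log|S Sigma^{-1}|
        d = m - mu
        total += n_i * (np.trace(W) - logdet - p + d @ Sigma_inv @ d)
    return total
```

Minimizing `discrepancy` over $\theta$ (through the maps from $\theta$ to the implied moments) with a general-purpose optimizer would give the estimator described above; each population contributes separately, so no between-population covariances are needed.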
3. Theoretical results. The main results of this paper are presented in
Theorem 1. We now define a set of assumptions for the model in (1) that 
assumes normal and independent variables over populations under the same 
parameterization as in Assumption 1. 

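Assumption 1B.

(i) For all $i$ and $j$ ($i = 1, \ldots, I$, $j = 1, \ldots, n^{(i)}$), $\zeta_j^{(i)} \sim N(\mu_{\zeta^{(i)}}, \Sigma_{\zeta^{(i)}})$ and are independent.

(ii) For all $\ell = 0, 1, \ldots, L^{(i)}$, $\varepsilon_{\ell j}^{(i)} \sim N(0, \Sigma_{\varepsilon_\ell^{(i)}})$.

(iii) The matrices $\beta^{(i)}$, $B^{(i)}$ and $\Sigma_{\varepsilon_0^{(i)}}$ can be restricted and are assumed to be functions of a vector $\tau$.

(iv) The matrices $\mu_{\zeta^{(i)}}$, $\Sigma_{\zeta^{(i)}}$ and $\Sigma_{\varepsilon_\ell^{(i)}}$, $\ell = 1, \ldots, L^{(i)}$, are assumed to be unrestricted.

Theorem 1 shows similarities and differences of the limiting results for the two different sets of Assumptions 1 and 1B.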

Theorem 1. Assume that the model in (1) holds under Assumption 1. In addition we make the following assumptions:

Assumption 2. There exists $\lim_{n_m \to \infty} (n^{(i)}/n) = r^{(i)}$, where $n_m = \min\{n^{(1)}, \ldots, n^{(I)}\}$ and $n = \sum_{i=1}^{I} n^{(i)}$.

Assumption 3. $(\forall \varepsilon > 0)(\exists \delta > 0) \ni |\gamma(\theta) - \gamma(\theta_0)| < \delta \Rightarrow \|\theta - \theta_0\| < \varepsilon$, where $\|x\| = \sqrt{x'x}$ and $\theta_0$ is the limiting true value of $\theta$.

Assumption 4. For all $i = 1, \ldots, I$, $\beta^{(i)}(\tau)$, $B^{(i)}(\tau)$ and $\Sigma_{\varepsilon_0^{(i)}}(\tau)$ are twice continuously differentiable in the parameter space of $\tau$. The columns of the matrix $\partial\gamma(\theta_0)/\partial\tau'$ are linearly independent.

Theorem 1 (cont.).

(i) Then

$V_G^{(\hat{\tau})} = V_{NI}^{(\hat{\tau})}$,

where $V_G^{(\hat{\tau})}$ and $V_{NI}^{(\hat{\tau})}$ are the asymptotic covariance matrices of $\hat{\tau}$ under the general Assumption 1 and under the standard Assumption 1B, respectively (the initials NI stand for normality and independence over populations and G stands for the general set of Assumption 1). The matrix $V_G^{(\hat{\tau})}$ is the part of the matrix $V_G^{(\hat{\theta})}$ that is the asymptotic covariance matrix for the estimated vector $\hat{\theta}$.
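(ii) For the asymptotic covariance matrices for the mean vectors $\hat{\mu}_{\zeta^{(i)}}$, (1) in case A of Assumption 1 with fixed $\zeta_j^{(i)}$,

(5)  $V_G^{(\hat{\mu}_{\zeta^{(i)}})} = V_{NI}^{(\hat{\mu}_{\zeta^{(i)}})} - \frac{1}{r^{(i)}} \Sigma_{\zeta^{(i)}}$

holds, and (2) in case B of Assumption 1 with random $\zeta_j^{(i)}$,

(6)  $V_G^{(\hat{\mu}_{\zeta^{(i)}})} = V_{NI}^{(\hat{\mu}_{\zeta^{(i)}})}$

holds.

(iii) For the asymptotic covariance matrices for $\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}})$, (1) in case A of Assumption 1 with fixed $\zeta_j^{(i)}$,

(7)  $V_G^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} = V_{NI}^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} - \frac{2}{r^{(i)}} (\Sigma_{\zeta^{(i)}} \otimes \Sigma_{\zeta^{(i)}})$

holds, and (2) in case B of Assumption 1 with random $\zeta_j^{(i)}$ and assuming that $\zeta_j^{(i)}$ have finite fourth moments,

(8)  $V_G^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} = V_{NI}^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} + \frac{1}{r^{(i)}} \mathrm{Var}[\mathrm{vec}(\zeta_j^{(i)} \zeta_j^{(i)\prime})] - \frac{2}{r^{(i)}} (\Sigma_{\zeta^{(i)}} \otimes \Sigma_{\zeta^{(i)}})$

holds.

(iv) The function $Q(\theta)$, defined in (4), evaluated on its minimum $\hat{\theta}$ converges to a chi-square distribution, $Q(\hat{\theta}) \xrightarrow{d} \chi^2_g$, with $g = \sum_{i=1}^{I} [p^{(i)} + p^{(i)}(p^{(i)} + 1)/2] - d_\theta$.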
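The degrees of freedom in part (iv) are a simple count of distinct first and second moments minus the number of free parameters. A minimal sketch, assuming the formula $g = \sum_i [p^{(i)} + p^{(i)}(p^{(i)}+1)/2] - d_\theta$; the function name is illustrative and the value of $d_\theta$ must be supplied by the user for the model at hand.

```python
def degrees_of_freedom(p, d_theta):
    """g = sum_i [p_i + p_i (p_i + 1) / 2] - d_theta.

    p       : iterable with the number of observed variables per population
    d_theta : number of free parameters in theta
    """
    moments = sum(p_i + p_i * (p_i + 1) // 2 for p_i in p)
    return moments - d_theta


# With p^(1) = 4 and p^(2) = 3, as in Example 1, there are
# (4 + 10) + (3 + 6) = 23 distinct sample moments in total.
n_moments = degrees_of_freedom([4, 3], 0)
```

The count uses only within-population moments, which reflects that the between-population covariances are not part of the fitted structure.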

Proof of Theorem 1. For the proof we need the following three lemmas.

Lemma 1. Assume that the model in (1) holds. If Assumptions 1, 2 and 3 hold, then as $n_m \to \infty$,

(9)  $\hat{\theta} \xrightarrow{p} \theta_0$.

Proof. From Assumption 1 and the law of large numbers, $c \xrightarrow{p} \gamma(\theta_0)$, which implies $Q(\theta_0) \xrightarrow{p} 0$. Since $Q(\theta) \ge 0$ $\forall\theta$ and $\hat{\theta}$ minimizes $Q$, we have $Q(\hat{\theta}) \xrightarrow{p} 0$. From the last result and Assumption 2 we get $\gamma(\hat{\theta}) \xrightarrow{p} \gamma(\theta_0)$, and (9) holds from Assumption 3. □

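Lemma 2. Let $\theta_n = (\tau_0', \upsilon_n')'$, where $\tau_0$ is the true value of $\tau$ and $\upsilon_n$ contains the vectors $\bar{\zeta}^{(i)}$, $\mathrm{vec}(S_{\zeta^{(i)}})$ and $\mathrm{vec}(S_{\varepsilon_\ell^{(i)}})$, $\ell = 1, \ldots, L^{(i)}$, for all $i = 1, \ldots, I$.

(i) Then, under the model and the assumptions considered in Lemma 1, and under Assumption 4,

(10)  $\sqrt{n}(\hat{\theta} - \theta_n) = A_0 \sqrt{n}[c - \gamma(\theta_n)] + o_p(1)$,

where $A_0$ is free of $n^{(i)}$ and

(11)  $A_0 = (J_0' \Omega_0^{-1} J_0)^{-1} J_0' \Omega_0^{-1}$,

where $J_0 = J(\gamma(\theta_0))$ is the Jacobian of $\gamma(\theta)$ evaluated at $\theta_0$, $\Omega_0^{-1} = \Omega^{-1}(\theta_0) = [r^{(1)} \Omega^{(1)-1}(\theta_0)] \oplus \cdots \oplus [r^{(I)} \Omega^{(I)-1}(\theta_0)]$ and $\Omega^{(i)-1}(\theta_0) = \Sigma^{(i)-1}(\theta_0) \oplus \{\frac{1}{2} \times [\Sigma^{(i)-1}(\theta_0) \otimes \Sigma^{(i)-1}(\theta_0)]\}$.

Recall that the ratios $r^{(i)}$ were defined in Assumption 2 and $c$ was defined in (3). The symbol $\oplus$ is the direct sum for matrices.

(ii) Also,

(12)  $Q(\hat{\theta}) = n[c - \gamma(\theta_n)]' M_0 [c - \gamma(\theta_n)] + o_p(1)$

with $M_0 = \Omega_0^{-1}(I - J_0 A_0)$.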
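The matrices in Lemma 2 can be assembled numerically once the Jacobian and the implied covariance matrices are available. The sketch below is illustrative only: it assumes the vec-based blocks $\Sigma^{(i)-1} \oplus \frac{1}{2}(\Sigma^{(i)-1} \otimes \Sigma^{(i)-1})$ weighted by $r^{(i)}$, and it uses a random full-column-rank matrix as a stand-in for $J_0$. The helper names are hypothetical.

```python
import numpy as np


def direct_sum(*mats):
    """Block-diagonal direct sum of matrices."""
    rows = sum(m.shape[0] for m in mats)
    cols = sum(m.shape[1] for m in mats)
    out = np.zeros((rows, cols))
    r = c = 0
    for m in mats:
        out[r:r + m.shape[0], c:c + m.shape[1]] = m
        r += m.shape[0]
        c += m.shape[1]
    return out


def omega_inv(sigmas, ratios):
    """Direct sum over populations of r_i * [Sigma_i^{-1} (+) 0.5 (Sigma_i^{-1} kron Sigma_i^{-1})]."""
    blocks = []
    for Sigma, r in zip(sigmas, ratios):
        S_inv = np.linalg.inv(Sigma)
        blocks.append(r * direct_sum(S_inv, 0.5 * np.kron(S_inv, S_inv)))
    return direct_sum(*blocks)


def a0_m0(J0, O_inv):
    """A0 = (J0' O^{-1} J0)^{-1} J0' O^{-1} and M0 = O^{-1} (I - J0 A0)."""
    A0 = np.linalg.solve(J0.T @ O_inv @ J0, J0.T @ O_inv)
    M0 = O_inv @ (np.eye(O_inv.shape[0]) - J0 @ A0)
    return A0, M0


# Stand-in inputs (assumptions for illustration only).
sigmas = [np.array([[2.0, 0.5], [0.5, 1.0]]), np.array([[1.5]])]
O_inv = omega_inv(sigmas, ratios=[0.6, 0.4])
J0 = np.random.default_rng(0).standard_normal((O_inv.shape[0], 3))
A0, M0 = a0_m0(J0, O_inv)
```

Two properties follow directly from (11): $A_0 J_0 = I$ and $M_0 J_0 = 0$, so the quadratic form in (12) only sees the part of $c - \gamma(\theta_n)$ that the model cannot absorb.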

Proof. (i) From Taylor's expansion and Lemma 1 it turns out that there exists $\theta^*$ on the line segment between $\hat{\theta}$ and $\theta_n$ such that

(13)  $J'[Q(\hat{\theta})] = J'[Q(\theta_n)] + H[Q(\theta^*)](\hat{\theta} - \theta_n)$,

where $J$ and $H$ are the Jacobian and Hessian matrices, respectively. Now for the Jacobian and Hessian matrices we proved that

(14)  $J'[Q(\theta_n)] = -2 J_0' \Omega_0^{-1} [c - \gamma(\theta_n)] + o_p(n^{-1/2})$,

(15)  $H[Q(\theta^*)] \xrightarrow{p} 2 J_0' \Omega_0^{-1} J_0$.

The result in (10) follows if we use (14), (15) and the fact that $J[Q(\hat{\theta})] = 0$ in (13).

(ii) After doing several matrix modifications, we get the quadratic form

(16)  $Q(\hat{\theta}) = n[c - \gamma(\hat{\theta})]' \Omega^{-1}(\hat{\theta}) [c - \gamma(\hat{\theta})] + o_p(1)$.

Also, there exists $\theta^*$ on the line segment between $\hat{\theta}$ and $\theta_n$ such that

(17)  $\gamma(\hat{\theta}) - \gamma(\theta_n) = J[\gamma(\theta^*)](\hat{\theta} - \theta_n)$.

From (17) and (10) we get that

(18)  $c - \gamma(\hat{\theta}) = [I - J_0 A_0][c - \gamma(\theta_n)] + o_p(n^{-1/2})$

and the result follows from (16) and (18). □
Lemma 3. (i) For the model in (1) under Assumption 1 it holds that

(19)  $c - \gamma(\theta_n) = E w$,

where $E$ is a constant matrix, $w$ consists of the subvectors $w^{(i)}$, $i = 1, \ldots, I$, and $w^{(i)}$ consists of the subvectors $\bar{\varepsilon}^{(i)}$, $\mathrm{vec}(S_{\varepsilon_0^{(i)} \varepsilon_0^{(i)}} - \Sigma_{\varepsilon_0^{(i)}})$ and $\mathrm{vec}(S_{x^{(i)} y^{(i)}})$ for all $x^{(i)}$ and $y^{(i)}$ such that $x^{(i)} \ne y^{(i)}$, $i = 1, \ldots, I$, and $x^{(i)}, y^{(i)} = \zeta^{(i)}, \varepsilon_0^{(i)}, \ldots, \varepsilon_{L^{(i)}}^{(i)}$.

(ii) The limiting distribution of $\sqrt{n} w$ is the same under Assumptions 1 and 1B.

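Proof. (i) We proved that the components of $c - \gamma(\theta_n)$ are written in the form

(20)  $\bar{\nu}^{(i)} - \mu^{(i)}(\theta_n) = B^{(i)} (0', \bar{\varepsilon}^{(i)\prime})'$,

(21)  $S_{\nu^{(i)}} - \Sigma^{(i)}(\theta_n) = B^{(i)} \begin{pmatrix} 0 & S_{\zeta^{(i)} \varepsilon^{(i)}} \\ S_{\varepsilon^{(i)} \zeta^{(i)}} & S_{\varepsilon^{(i)} \varepsilon^{(i)}} - D_{\varepsilon^{(i)}} \end{pmatrix} B^{(i)\prime}$,

where $D_{\varepsilon^{(i)}} = \Sigma_{\varepsilon_0^{(i)}} \oplus S_{\varepsilon_1^{(i)}} \oplus \cdots \oplus S_{\varepsilon_{L^{(i)}}^{(i)}}$. The result in (19) follows by noting in (20) and (21) that the components of $c - \gamma(\theta_n)$ are products of constant matrices (functions of $B^{(i)}$) and the subvectors of $w^{(i)}$, and also using the property $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$.

(ii) Note that the matrix $S_{\varepsilon^{(i)} \varepsilon^{(i)}} - D_{\varepsilon^{(i)}}$ does not depend on $S_{\varepsilon_\ell^{(i)}}$ for $\ell = 1, \ldots, L^{(i)}$. Also note that within the populations for each $i$ the subvectors of $\sqrt{n} w^{(i)}$ are independent and their limiting distributions do not depend on the nonnormality of the latent variables and on the fixed latent variables in case A (see [4], Theorem 5.1). Now between the populations, the limiting covariance between $w^{(i)}$ and $w^{(m)}$ for $i \ne m$ is 0 despite the correlation of $\zeta_j^{(i)}$ and $\zeta_j^{(m)}$ for each $j$. This holds because the limiting covariance between $\sqrt{n}\,\mathrm{vec}(S_{\zeta^{(i)} \varepsilon^{(i)}})$ and $\sqrt{n}\,\mathrm{vec}(S_{\zeta^{(m)} \varepsilon^{(m)}})$ is 0 since the errors are assumed to be independent over populations. □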
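The vec identity used in part (i) of the proof is easy to check numerically. The sketch below assumes the column-stacking definition of vec, for which the identity reads $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$; the matrices are arbitrary stand-ins.

```python
import numpy as np


def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")


rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5))

lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)
identity_holds = np.allclose(lhs, rhs)
```

Applied to (21) with $A = B^{(i)}$ and $C = B^{(i)\prime}$, the identity turns the matrix equation into a constant matrix times the vec of the middle block, which is the linear form claimed in (19).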

Now we return to the proof of Theorem 1. For (i) Lemmas 2(i) and 3(i) show that $\sqrt{n}(\hat{\tau} - \tau_0)$ is a linear combination of $\sqrt{n} w$ and thus the result follows from Lemma 3(ii).

For cases (ii) and (iii) we use the respective equations

(22)  $\sqrt{n}(\hat{\mu}_{\zeta^{(i)}} - \mu^0_{\zeta^{(i)}}) = \sqrt{n}(\hat{\mu}_{\zeta^{(i)}} - \bar{\zeta}^{(i)}) + \sqrt{n}(\bar{\zeta}^{(i)} - \mu^0_{\zeta^{(i)}})$,

(23)  $\sqrt{n}\,\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}} - \Sigma^0_{\zeta^{(i)}}) = \sqrt{n}\,\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}} - S_{\zeta^{(i)}}) + \sqrt{n}\,\mathrm{vec}(S_{\zeta^{(i)}} - \Sigma^0_{\zeta^{(i)}})$,

where $\mu^0_{\zeta^{(i)}}$ and $\Sigma^0_{\zeta^{(i)}}$ are the true values of the corresponding parameters. In both (ii) and (iii), for case A with fixed factors, we need the limiting distributions of the first vectors in the second parts of (22) and (23). For case B with random factors, we need the limiting distributions of the vectors in the first parts of (22) and (23). Since the procedure is the same for (ii) and (iii), we explain the proof only for part (iii). So for case A in (23) we compute the limiting covariance matrices of all three vectors under Assumption 1B,

(24)  $V_{NI}^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} = V_2 + \frac{2}{r^{(i)}} (\Sigma_{\zeta^{(i)}} \otimes \Sigma_{\zeta^{(i)}})$.

From Lemmas 2(i) and 3 it follows that the first vector of the second part of (23) has the same limiting distribution under Assumption 1 with fixed factors and under Assumption 1B. Thus $V_2 = V_G^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))}$ and the result follows by solving (24) for $V_G^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))}$.
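Now for case B in (iii) we compute the limiting covariance matrices under Assumption 1B and under Assumption 1, and we get, respectively,

(25)  $V_{NI}^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} = V_{2,NI} + \frac{2}{r^{(i)}} (\Sigma_{\zeta^{(i)}} \otimes \Sigma_{\zeta^{(i)}})$,

(26)  $V_G^{(\mathrm{vec}(\hat{\Sigma}_{\zeta^{(i)}}))} = V_{2,G} + \frac{1}{r^{(i)}} \mathrm{Var}[\mathrm{vec}(\zeta_j^{(i)} \zeta_j^{(i)\prime})]$.

Again, from Lemmas 2 and 3 it follows that $V_{2,G} = V_{2,NI}$. The result follows by solving (25) for $V_{2,NI}$ and substituting the result in (26).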

(iv) Lemmas 2(ii) and 3(i) show that $Q(\hat{\theta})$ is a quadratic function of $\sqrt{n} w$, and the result follows from Lemma 3(ii) and the known result that $Q(\hat{\theta}) \xrightarrow{d} \chi^2_g$ under Assumption 1B. □
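For fixed factors, the proof obtains the covariance of the estimated factor means by subtracting the sampling variability of $\bar{\zeta}^{(i)}$ from the normal-theory value. A minimal sketch of that adjustment, assuming the relation $V_G = V_{NI} - \Sigma_{\zeta^{(i)}}/r^{(i)}$ for the means and the usual scaling of an asymptotic covariance of $\sqrt{n}(\hat{\mu} - \mu)$; the function names and the numbers are hypothetical.

```python
import numpy as np


def fixed_factor_mean_cov(V_ni, Sigma_zeta, r):
    """Asymptotic covariance of the factor means for fixed factors.

    V_ni       : normal-theory (NI) asymptotic covariance of the estimated means
    Sigma_zeta : (limiting) covariance matrix of the factors
    r          : limiting ratio n_i / n for the population
    """
    return V_ni - Sigma_zeta / r


def ase(V, n):
    """Asymptotic standard errors from the covariance of sqrt(n) * (estimate - truth)."""
    return np.sqrt(np.diag(V) / n)


# Hypothetical inputs for a single factor.
V_g = fixed_factor_mean_cov(np.array([[3.0]]), np.array([[1.0]]), r=0.5)
se = ase(V_g, n=100)
```

In practice the normal-theory covariance would come from the fitting software and $\Sigma_{\zeta^{(i)}}$ would be replaced by its estimate.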

Theorem 1(i) and (iv) actually extend Theorem 1, proved by Satorra [33] for independent groups, to correlated populations, and they can be applied to any type of correlated data that can be grouped into a few groups with uncorrelated data (e.g., in panel data by grouping the occasions).

To derive large sample results for minimizing (4) under the model in (1)
and Assumption 1, we consider the case where all $n^{(i)}$ increase to infinity at
a common rate and use $n$ as the index for taking a limit in Assumption 2.
Assumption 3 is a standard identification condition used in Lemma 1. Note
that the true value of $\theta$ in case A of Assumption 1 with fixed variables
depends on $n^{(i)}$, since it contains $\bar{\xi}^{(i)}$ and $S_{\xi^{(i)}}$. Thus, we denote the limit of
the true value as $\theta_0$. Lemma 1 gives the consistency of the estimator $\hat{\theta}$ that
minimizes (4) for the model in (1). Hence, under very weak distributional
specifications in Assumption 1, the estimator $\hat{\theta}$ is consistent for the limiting
true value $\theta_0$. In fact, it is clear from the proof that the consistency of $\hat{\theta}$
holds for any general mean and covariance structure model $\gamma(\theta) = E(c)$
satisfying $c \xrightarrow{p} \gamma(\theta_0)$.

To characterize the limiting behavior of $\hat{\theta}$ in more detail,
especially for the assessment of the so-called asymptotic robustness
properties, it is convenient to consider an expansion of $\hat{\theta}$, not around the true value
or the limiting true value $\theta_0$, but around some other quantity $\theta_n$ defined in
Lemma 2 that depends on the unobservable sample moments of the nonnormal
latent variables $\xi^{(i)}$ and $\varepsilon_\ell^{(i)}$ ($\ell = 1, \ldots, L^{(i)}$). Thus, the limiting true
value $\nu_0$ that consists of the true covariance matrices of the random latent
variables is replaced in $\theta_n$ by $\nu_n$ that consists of the unobservable sample
moments. While statistical inference is to be made for the true value of $\theta$, $\theta_n$
with an artificial quantity $\nu_n$ plays a useful role in assessing the property of
$\hat{\tau}$ in $\hat{\theta}$, as well as in characterizing the limiting distribution of the whole $\hat{\theta}$
without specifying any moments for $\xi^{(i)}$ and $\varepsilon_\ell^{(i)}$ ($\ell = 1, \ldots, L^{(i)}$) higher than
second order. To obtain an expansion of $\hat{\theta}$ around $\theta_n$, we need some
smoothness conditions for $\beta^{(i)}(\tau)$, $B^{(i)}(\tau)$ and $\Sigma_{\varepsilon_0^{(i)}}(\tau)$, and the full-column rank
of the Jacobian matrix $J[\gamma(\theta_0)]$ that are stated in Assumption 4. Since the
linear independence of the columns of $J[\gamma(\theta_0)]$ associated with the $\nu$ part of
$\theta$ is trivial, we need to assume only that the $\tau$ part of the model is specified
without redundancy. Thus in Assumption 4 we just assume that $\partial\gamma(\theta_0)/\partial\tau'$
is of full-column rank and Lemma 2 expresses the leading term of $\sqrt{n}(\hat{\theta} - \theta_n)$
in terms of $c - \gamma(\theta_n)$. Note that the use of $\theta_n$ in Lemma 2 produces an
expansion of $\hat{\theta}$ around $\theta_n$ with the existence of only second moments of $\xi^{(i)}$
and $\varepsilon_\ell^{(i)}$ ($\ell = 1, \ldots, L^{(i)}$). It can be shown from the proof that the expansion
in Lemma 2 holds for the general model $\gamma(\theta) = E(c)$ and for any $\theta_n$ with
$\theta_n \xrightarrow{p} \theta_0$ provided that $\sqrt{n}[c - \gamma(\theta_n)]$ converges in distribution. However, the
special choice of $\theta_n$ for the model in (1) makes the result of Lemma 2
practically meaningful. Lemma 3 is actually the key tool in the proof that shows
asymptotic robustness. It expresses $\sqrt{n}[c - \gamma(\theta_n)]$ in terms of $\sqrt{n}\,w$, which
has the same limiting distributions under Assumptions 1 and 1B. Thus, the
main difficulty in the proof of Theorem 1(i) is to express $\sqrt{n}(\hat{\tau} - \tau_0)$ in
terms of a vector $\sqrt{n}\,w$ whose limiting distribution does not depend on the
existence of fixed, nonnormal and correlated-over-population variables.
Similarly, we proved Theorem 1(iv) by expressing $Q(\hat{\theta})$ as a quadratic function
of $\sqrt{n}\,w$.

The formulas in (5) and (7) in Theorem 1 show what corrections
should be made when we have fixed variables in order to get correct
asymptotic standard errors for $\hat{\mu}_{\xi^{(i)}}$ and $\mathrm{vec}(\hat{\Sigma}_{\xi^{(i)}})$. These results are novel even
for the case with one population. The formula (6) in Theorem 1(ii)(2) shows
that the asymptotic standard errors for $\hat{\mu}_{\xi^{(i)}}$ are robust. Equation (8) in
Theorem 1(iii)(2) gives the limiting covariance matrix for $\mathrm{vec}(\hat{\Sigma}_{\xi^{(i)}})$ when
$\xi^{(i)}$ are random. Formula (8) involves the computation of fourth-order
cumulants of the latent variables $\xi^{(i)}$. This is possible in practice
and we obtain satisfactory results when we use the errors-in-variables
parameterization and have normal errors. For instance, in Example 1 for the
model in (2) with normal errors the fourth-order cumulants for $\xi^{(i)}$ are equal
to the fourth-order cumulants of the observed variables $x^{(i)}$, since the
fourth-order cumulants of the normal errors are equal to 0. This technique
was used in our simulation study and the results are illustrated in the next
section. Note that in most practical cases the measurement errors follow a
normal distribution.

Although the paper refers to the multisample case the same theory and 
methodology can be applied to longitudinal data. That is, two different 
applications, correlated populations and panel data, can be considered by 
fitting the same kind of modeling and applying the results presented in 
this paper. A similar method developed for longitudinal data, known as the
generalized estimating equations (GEE) method, was proposed by Liang and
Zeger [19]. The GEE method was proposed for generalized linear models
with univariate outcome variables. In this paper several response variables 
are observed and their relationships are explained by a few latent variables 
within the time points. It can be shown that a special case of the GEE 
method, using the identity matrix as the "working" correlation matrix, is 
a special case of the model considered in this paper. This can be done by 
treating the outcome variable and the covariates of the generalized linear 
models as observed variables in the model considered in this paper and 
setting latent variables equal to covariates by fixing error variances equal to 
zero. Thus, the results presented in this paper can also be applied to simpler
models such as generalized linear models for longitudinal data. On the other
hand, a "working" correlation matrix such as the one used in the GEE
method could also be used in this methodology in order to increase the
efficiency of the method.

Now we define a generalized version of the so-called sandwich estimator
used by the GEE method for generalized linear models with the identity
matrix as the "working" correlation matrix, and also used by Satorra [28,
29, 30, 31, 32, 33] for latent variable models. We generalize this matrix for
correlated populations and compare it, theoretically and numerically, with
our proposed matrix $V_G^{(\hat{\theta})}$ defined in Theorem 1. A generalized
version of the sandwich (S) estimator is

(27)  $V_S^{(\hat{\theta})} = A_0\, E(S_d)\, A_0'$,

where $A_0$ is defined in (11) and $E(S_d)$ is the expectation of the sample
matrix $S_d$, which involves third- and fourth-order sample moments and is defined as

/ 1 p(ll) 1 g(U)\ 



n( L1 ) d n 



-l^S (n) ••• 1 s (J/) 



with 



s ? > = ^£< d ?- d(i, >( d f-a w >' 



and 



<# - ( 1 

' Vve c [(„f - P('))(„f 

where $i, k = 1, \ldots, I$, $j = 1, \ldots, n^{(ik)}$, and $n^{(ik)}$ denotes the number of
correlated individuals between the $i$th and the $k$th populations. Note that the
form of the matrix $V_S^{(\hat{\theta})}$ in (27) can be derived from Lemma 2. Equation (12)
in Lemma 2 also holds if we replace $\theta_n$ by the true value of $\theta$, and the
result follows by noting that $\mathrm{Var}[c - \gamma(\theta_0)] = E(S_d)$. Theorem 1 actually gives
an alternative form of some of the parts of the matrix $V_S^{(\hat{\theta})}$. The parts of
the matrix $V_G^{(\hat{\theta})}$ defined in Theorem 1 are theoretically exactly the
same as the corresponding parts of the matrix $V_S^{(\hat{\theta})}$. In practice, the matrix
$A_0 = A(\theta_0)$ is estimated by $\hat{A}_0 = A(\hat{\theta})$ and the matrix $E(S_d)$ is estimated
by $S_d$. Although the two matrices $V_G^{(\hat{\theta})}$ and $V_S^{(\hat{\theta})}$ are theoretically
equal, in practice the asymptotic standard errors (a.s.e.'s) computed
by the matrix $V_G^{(\hat{\theta})}$ have less variability than the a.s.e.'s computed by the
matrix $V_S^{(\hat{\theta})}$. This happens because the estimation of $V_S^{(\hat{\theta})}$ involves third-
and fourth-order moments that are more variable than the second moments
of the matrix $V_G^{(\hat{\theta})}$. The matrix $V_G^{(\hat{\theta})}$ involves fourth moments only in the
formula of Theorem 1(iii)(2), but these moments do not affect the
computation of the other a.s.e.'s. This advantage of using the matrix $V_G^{(\hat{\theta})}$ is shown
in the simulation study in the next section.



4. Simulation study. We simulate the model in (2) of Example 1. A
sample from both populations was generated 1000 times. The simulation
was done twice: once with fixed and once with random $\xi^{(i)}$ (cases
A and B of Assumption 1, respectively). In both cases, $\xi_j^{(1)}$ and $\xi_j^{(2)}$ are
related (correlated over populations) and were generated as linear
combinations of chi-square random variables with 10 degrees of freedom. In case A,
a sample of $(\xi_j^{(1)}, \xi_j^{(2)})$ was generated with sample means, variances and
covariance $\bar{\xi}^{(1)} = 4.95$, $\bar{\xi}^{(2)} = 9.95$, $s^2_{\xi^{(1)}} = 1.97$, $s^2_{\xi^{(2)}} = 1.95$ and $s_{\xi^{(1)}\xi^{(2)}} = 1.36$,
respectively, and the set of $(\xi_j^{(1)}, \xi_j^{(2)})$ was used in all 1000 Monte Carlo
samples. In case B, 1000 independent samples were generated for
$\{\xi_j^{(1)}, j = 1, \ldots, 1000;\ \xi_j^{(2)}, j = 1, \ldots, 500\}$. The true means, variances and covariance of
$\xi_j^{(1)}$ and $\xi_j^{(2)}$ are $\mu_{\xi^{(1)}} = 5$, $\mu_{\xi^{(2)}} = 10$, $\sigma^2_{\xi^{(1)}} = 2$, $\sigma^2_{\xi^{(2)}} = 2$ and $\sigma_{\xi^{(1)}\xi^{(2)}} = 1.4$.
Note that the above means and variances are estimated, but the covariance
$\sigma_{\xi^{(1)}\xi^{(2)}}$ is not, in accordance with the estimation method that we suggest.
Note that we suggest this method for several populations with quite
unbalanced data. In this study it is easy to use the full likelihood and estimate the
covariance $\sigma_{\xi^{(1)}\xi^{(2)}}$, but this is not always true in more complicated studies.
By not estimating some of the covariances between the two populations, we
lose some efficiency; for example, we obtain larger a.s.e.'s. We discuss the
efficiency of the method in more detail later in this section.

In both cases A and B, 1000 samples were generated for independent
$\varepsilon_{\ell j}^{(i)}$, $i = 1, 2$, $\ell = 0, 1, \ldots, L^{(i)}$, with $L^{(1)} = 3$ and $L^{(2)} = 2$. The errors $\varepsilon_{0j}^{(i)}$, $i = 1, 2$,
are normally distributed with mean 0 and unknown variance $\sigma^2_{\varepsilon_0^{(i)}}$, while
all the other errors $\varepsilon_{\ell j}^{(i)}$ for $i = 1, 2$, $\ell = 1, \ldots, L^{(i)}$, were generated from a
chi-square distribution with 10 degrees of freedom, $\chi^2_{10}$, with adjusted mean 0
and variance $\sigma^2_{\varepsilon_\ell^{(i)}}$. The variance for $\varepsilon_0$ is common for the two populations,
$\sigma^2_{\varepsilon_0} = \sigma^2_{\varepsilon_0^{(1)}} = \sigma^2_{\varepsilon_0^{(2)}}$. In both cases with fixed and random $\xi^{(i)}$, the true
values for the error variances are $\sigma^{02}_{\varepsilon_0} = \sigma^{02}_{\varepsilon_1} = 0.1$ and $\sigma^{02}_{\varepsilon_2} = \sigma^{02}_{\varepsilon_3} = 0.2$, and
the true value for the vector $\tau$ is $\tau^0 = (1, 2, -1, -0.1, 0.1, -0.01, 1, 0.1)$. The
parameters of $\tau$ are shown in the first column of the first part of Table 1.
In accordance with the notation of this paper, the vector $\theta = (\tau', \nu')'$, where
$\nu$ contains $\sigma^2_{\varepsilon_\ell^{(i)}}$ ($i = 1, 2$, $\ell = 1, \ldots, L^{(i)}$) and the means and variances of $\xi^{(i)}$
($i = 1, 2$). To estimate $\theta$, we use normal MLE by minimizing (4) despite the
appearance of fixed and nonnormal variables, and when we estimate the
parameters, we are pretending that we do not know the true values of the
parameters.
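
The generation of the correlated latent variables described above can be sketched as follows. This is only an illustration under stated assumptions: the text specifies linear combinations of chi-square variables with 10 degrees of freedom and the target means (5, 10), variances (2, 2) and covariance (1.4), but not the particular combination. The shared-component construction, the function name `simulate_latent` and its arguments are hypothetical choices made here.

```python
import numpy as np

def simulate_latent(n1=1000, n2=500, df=10, var=2.0, cov=1.4,
                    means=(5.0, 10.0), rng=None):
    """Draw correlated, nonnormal latent scores for two populations.

    Assumed construction (not taken from the paper): each score is a
    weighted sum of a shared chi-square(df) component and an own
    chi-square(df) component, shifted to the target mean.
    """
    rng = np.random.default_rng() if rng is None else rng
    chi_var = 2.0 * df                      # variance of a chi-square(df) variable
    a = np.sqrt(cov / chi_var)              # weight on the shared component
    b = np.sqrt((var - cov) / chi_var)      # weight on the own component
    shared = rng.chisquare(df, size=n1)     # induces correlation over populations
    own1 = rng.chisquare(df, size=n1)
    own2 = rng.chisquare(df, size=n2)
    shift = df * (a + b)                    # mean of the unshifted combination
    xi1 = a * shared + b * own1 + (means[0] - shift)
    # Only the first n2 individuals are common to both populations.
    xi2 = a * shared[:n2] + b * own2 + (means[1] - shift)
    return xi1, xi2

xi1, xi2 = simulate_latent(rng=np.random.default_rng(2024))
print(xi1.mean(), xi2.mean(), xi1.var(ddof=1), xi2.var(ddof=1))
```

For case A one such draw would be kept fixed across all Monte Carlo replications, while for case B a new draw would be made in every replication.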

Some of the results in the simulation study are shown in the first part of
Table 1. Columns 2, 4 and 6 show results from case A with fixed $\xi^{(i)}$, while
columns 3, 5 and 7 show results from case B with random $\xi^{(i)}$. Columns 2
and 3 of Table 1 compare the a.s.e.'s (Gse) computed by the matrix $V_G^{(\hat{\tau})}$ in
Theorem 1(i) with the Monte Carlo standard errors (MCse). All the ratios
are 1 or very close to 1 and this means that the proposed a.s.e.'s have very
small bias. Bias exists because we use the a.s.e.'s as estimates for the true
s.e.'s of the parameters in finite samples. Actually, Lemma 1 proves that the
bias converges to 0 as the sample sizes increase to infinity. In this study, for
sample sizes $n^{(1)} = 1000$ and $n^{(2)} = 500$, the bias is negligible.

Table 1
Results from the simulation study*

                           Bias of Gse          Variability of Gse   Efficiency of the method
                                                                     relative to the full likelihood
                           Gse/MCse             SMCse/GMCse          PL-MCse/FL-MCse
Parameters $\tau$          Fixed     Random     Fixed     Random     Fixed     Random

$\beta_1$                  1.01      1.01       1.63      1.56       0.99      1.03
$\beta_2$                  1.01      0.99       1.78      1.68       1.01      1.05
$\beta_3$                  0.97      1.00       1.84      1.50       1.00      1.06
$\gamma_1$                 1.00      0.99       1.44      1.47       1.00      1.04
$\gamma_2$                 0.97      0.99       2.02      1.56       1.01      1.05
$\delta_1$                 1.00      1.00       1.65      1.57       1.00      1.03
$\delta_2$                 1.00      0.98       1.60      1.44       1.02      1.06
$\sigma^2_{\varepsilon_0}$ 0.99      0.99       2.68      1.56       1.00      1.03

Results for $\gamma_1$ under different distribution assumptions:
degrees of freedom for $\chi^2(d_1)$ and $\chi^2(d_2)$

$d_1$   $d_2$
 1       1                 1.00      1.00       1.59      1.69       1.01      1.09
 3       3                 1.00      1.01       1.55      1.43       1.01      1.07
 3      10                 0.99      0.98       1.48      1.41       1.01      1.07
10       3                 0.99      1.00       1.51      1.51       1.01      1.04

* Monte Carlo standard errors (MCse) for the estimated parameters in $\tau$ versus the
proposed a.s.e.'s (Gse) of $\hat{\tau}$, computed by $V_G^{(\hat{\tau})}$ defined in Theorem 1. Comparison between
the MCse for Gse (GMCse) and the MCse for the a.s.e.'s computed by the sandwich
estimator, $V_S^{(\hat{\tau})}$, given in (27) (SMCse). MCse computed under the full likelihood (FL) and
under the partial likelihood (PL). Results are shown for cases A and B of Assumption 1
with fixed and random $\xi^{(i)}$.

Now we compute Monte Carlo standard errors for the a.s.e.'s computed
by the matrix $V_G^{(\hat{\tau})}$ (GMCse) and for the a.s.e.'s computed by the matrix
$V_S^{(\hat{\tau})}$ (SMCse), defined in (27). The ratio (SMCse)/(GMCse) compares the
variability of the two different estimates of the a.s.e.'s. This ratio is
computed for the parameters in $\tau$ and the results are shown in columns 4 and 5
of Table 1 for both cases with fixed and random $\xi^{(i)}$. All the ratios are well
above 1 (between 1.44 and 2.68 in the first part of Table 1), which indicates
that the a.s.e.'s computed by the sandwich estimator $V_S^{(\hat{\tau})}$ have larger
variability than the a.s.e.'s computed by our suggested estimator $V_G^{(\hat{\tau})}$.
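
The kind of comparison reported in columns 2 to 5 of Table 1 can be illustrated with a much simpler model. The sketch below is not the structural equation model of the paper: it assumes a simple linear regression with chi-square distributed regressor and errors, and compares the usual second-moment standard error of the slope with a sandwich (HC0-type) standard error that uses fourth-order sample moments. The function name `se_variability_ratio` and all settings are hypothetical.

```python
import numpy as np

def se_variability_ratio(n=200, reps=1000, df=10, seed=0):
    """Monte Carlo comparison of two standard-error estimates for an OLS slope."""
    rng = np.random.default_rng(seed)
    x = rng.chisquare(df, size=(reps, n))                      # nonnormal regressor
    e = (rng.chisquare(df, size=(reps, n)) - df) / np.sqrt(2 * df)  # mean 0, variance 1
    y = 1.0 + 2.0 * x + e
    xc = x - x.mean(axis=1, keepdims=True)
    sxx = (xc ** 2).sum(axis=1)
    slope = (xc * y).sum(axis=1) / sxx
    resid = (y - y.mean(axis=1, keepdims=True)) - slope[:, None] * xc
    # Second-moment (model-based) standard error of the slope.
    se_model = np.sqrt((resid ** 2).sum(axis=1) / (n - 2) / sxx)
    # Sandwich standard error; its "meat" uses fourth-order sample moments.
    se_sandwich = np.sqrt(((xc * resid) ** 2).sum(axis=1)) / sxx
    mc_se = slope.std(ddof=1)                                  # Monte Carlo s.e. of the slope
    return {
        "bias_model": se_model.mean() / mc_se,                 # analogue of Gse/MCse
        "bias_sandwich": se_sandwich.mean() / mc_se,
        "variability": se_sandwich.std(ddof=1) / se_model.std(ddof=1),  # analogue of SMCse/GMCse
    }

print(se_variability_ratio())
```

In this toy setting both standard errors are close to the Monte Carlo standard error on average, while the sandwich version varies more from sample to sample, which is the qualitative pattern described above.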

Now, as to the efficiency of the method, we computed the a.s.e.'s under
the full likelihood (FL) and under the partial likelihood (PL) given in (4).
The ratio of the two a.s.e.'s,

(28)  $\text{efficiency} = \dfrac{\text{PL-MCse}}{\text{FL-MCse}}$,

is given for all the parameters in $\tau$ in the last two columns of Table 1. These
ratios actually show the efficiency of the method relative to the FL. In both
cases the efficiency is very satisfactory since the ratios are close to 1. The
efficiency loss is very small for case A with fixed $\xi^{(i)}$ and relatively small for
case B with random $\xi^{(i)}$.

In the second part of Table 1, we make the nonnormal distributions more
skewed to the right by changing the degrees of freedom, $d_1$ and $d_2$, for
$\xi_j^{(i)} \sim \chi^2(d_1)$ and $\varepsilon_{\ell j}^{(i)} \sim \chi^2(d_2)$. All the results remain the same for case
A with fixed $\xi^{(i)}$, and the only difference in case B with random $\xi^{(i)}$ is that
the efficiency ratio of the method relative to the full likelihood (last column)
becomes larger but remains smaller than 1.10 even in the extreme case with 1
degree of freedom for both $d_1$ and $d_2$. Thus, the derived asymptotic standard
errors give satisfactory results for distributions with very long tails that often
appear in applications (e.g., in finance and banking).

For the parameters $\mu_{\xi^{(1)}}$, $\mu_{\xi^{(2)}}$, $\sigma^2_{\xi^{(1)}}$ and $\sigma^2_{\xi^{(2)}}$ we used the formulas in (5),
(6), (7) and (8) provided in Theorem 1(ii) and (iii) and we derived results
similar to the previous ones. It should be pointed out that the sandwich
estimator does not provide correct a.s.e.'s for case A with fixed $\xi^{(i)}$ for
the parameters $\mu_{\xi^{(1)}}$, $\mu_{\xi^{(2)}}$, $\sigma^2_{\xi^{(1)}}$ and $\sigma^2_{\xi^{(2)}}$. Our novel formulas in (5) and
(7) show what corrections should be made in order to obtain the correct
a.s.e.'s in this case. The a.s.e.'s are evaluated at the estimated value of $\theta$, $\hat{\theta}$.
Note that all the a.s.e.'s are functions of $\theta$ except the ones for $\hat{\sigma}^2_{\xi^{(1)}}$ and $\hat{\sigma}^2_{\xi^{(2)}}$
(elements of the matrix $\hat{\Sigma}_{\xi^{(i)}}$ in Theorem 1) that require fourth moments (or
cumulants) for $\xi^{(i)}$. In general, the fourth-order cumulants, $\psi$, satisfy
the following additivity property: if $x = y + z$ with $y$ and $z$ independent random
variables, then $\psi_x = \psi_y + \psi_z$. Thus, in the model used in the simulation, it
holds that $\psi_{x^{(i)}} = \psi_{\xi^{(i)}} + 0$, since the errors, $\varepsilon_{0j}^{(i)}$, are assumed to be normal,
having fourth-order cumulants equal to 0. Thus, the sample fourth-order
cumulants of $x^{(i)}$ were used for the computation of the a.s.e.'s for $\hat{\sigma}^2_{\xi^{(1)}}$ and
$\hat{\sigma}^2_{\xi^{(2)}}$.
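
The additivity property of fourth-order cumulants used above can be checked numerically. The sketch below is a univariate illustration under assumed settings (a rescaled chi-square latent variable with variance 2 and a normal error with variance 0.1); the helper `fourth_cumulant` and the variable names are hypothetical. The theoretical value uses the known fourth cumulant $48\,d$ of a chi-square variable with $d$ degrees of freedom.

```python
import numpy as np

def fourth_cumulant(v):
    """Sample fourth-order cumulant: m4 - 3 * m2**2 (central moments)."""
    c = np.asarray(v, dtype=float) - np.mean(v)
    m2 = np.mean(c ** 2)
    m4 = np.mean(c ** 4)
    return m4 - 3.0 * m2 ** 2

rng = np.random.default_rng(1)
n, df = 2_000_000, 10
scale = np.sqrt(2.0 / (2 * df))                 # rescale chi-square(df) to variance 2
xi = 5.0 + scale * (rng.chisquare(df, n) - df)  # nonnormal latent variable
e0 = rng.normal(0.0, np.sqrt(0.1), n)           # normal error: fourth cumulant is 0
x = xi + e0                                     # observed variable

k4_xi = fourth_cumulant(xi)                     # not available in practice
k4_x = fourth_cumulant(x)                       # computable from observed data
k4_theory = 48 * df * scale ** 4                # fourth cumulant of the rescaled chi-square
print(k4_xi, k4_x, k4_theory)
```

The two sample cumulants agree up to sampling error, which is why the observed variable can stand in for the latent one when the error is normal.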

The a.s.e.'s can be used for hypothesis testing of the parameters. The
power of the tests is also robust when the sample sizes are quite large due
to the applicability of the multivariate central limit theorem. In the above
simulation study, we use, as an example, $H_0: \delta_1 = 0$ versus $H_1: \delta_1 < 0$ in case
A with fixed $\xi^{(i)}$. Using level of significance $\alpha = 0.05$, $H_0$ is rejected when
$z < -1.645$, where $z = \hat{\delta}_1 / (\text{a.s.e. of } \hat{\delta}_1)$. Thus, the expected power (EP) is approximately

(29)  $\mathrm{EP}(\delta_1^0) = \Phi\!\left(-1.645 + \dfrac{-\delta_1^0}{\text{a.s.e. of } \hat{\delta}_1}\right) = 0.956$,

where the $\Phi$ function is the standard cumulative normal distribution and we
compute the power for the actual value of $\delta_1$, $\delta_1^0 = -0.01$. We also compute
the simulated power (SP) as

(30)  $\mathrm{SP} = \dfrac{\#\text{ of times that } [\hat{\delta}_1 / (\text{a.s.e. of } \hat{\delta}_1)] < -1.645}{1000} = 0.967$.

Thus, the results support the robustness of power for nonnormal and
correlated populations. The power for overall-fit measures was investigated by
Satorra and Saris [36] and Satorra [34] in structural equation models.
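
The expected and simulated power of the one-sided test above can be sketched as follows. The a.s.e. value 0.003 is a hypothetical input chosen for illustration, not a number reported in the text, and the simulated version draws the estimate directly from its normal approximation with a known a.s.e., whereas the study re-estimates the model and the a.s.e. in every replication.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_power(delta0, ase, z_crit=1.645):
    """Approximate power of H0: delta = 0 versus H1: delta < 0 at the true value delta0."""
    return std_normal_cdf(-z_crit - delta0 / ase)

def simulated_power(delta0, ase, reps=1000, z_crit=1.645, seed=0):
    """Share of replications in which z = estimate / a.s.e. falls below -z_crit."""
    rng = random.Random(seed)
    rejections = sum(rng.gauss(delta0, ase) / ase < -z_crit for _ in range(reps))
    return rejections / reps

ase = 0.003  # hypothetical asymptotic standard error
print(expected_power(-0.01, ase), simulated_power(-0.01, ase))
```

At `delta0 = 0` the expected power reduces to the significance level of the test, which is a quick sanity check on the formula.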

The robustness of the chi-square test statistic is shown in Table 2 for
case A with fixed $\xi^{(i)}$. The mean and the variance of the 1000 simulated
values of $Q(\hat{\theta})$ in (4) are close to the expected 6 and 12, respectively. Also,
the simulated percentiles in the second row are close to the expected ones
given in the first row of Table 2. For similar studies using simpler models,
see [30, 32, 33] and [25].
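
A summary of the kind reported in Table 2 can be computed from any set of simulated fit statistics. In the sketch below the statistics are replaced by direct chi-square draws with 6 degrees of freedom, which is an assumption made only so that the example is self-contained; the function name `summarize_fit_statistics` is hypothetical, and the quantiles are the standard tabulated values for 6 degrees of freedom.

```python
import numpy as np

# Tabulated quantiles of the chi-square distribution with 6 degrees of freedom.
CHI2_6_QUANTILES = {10: 2.204, 25: 3.455, 50: 5.348, 75: 7.841,
                    90: 10.645, 95: 12.592, 99: 16.812}

def summarize_fit_statistics(q_values, quantiles=CHI2_6_QUANTILES):
    """Mean, variance and percentage of statistics below each theoretical quantile."""
    q = np.asarray(q_values, dtype=float)
    coverage = {p: 100.0 * float(np.mean(q <= c)) for p, c in quantiles.items()}
    return float(q.mean()), float(q.var(ddof=1)), coverage

rng = np.random.default_rng(7)
stand_in = rng.chisquare(6, size=1000)   # stand-in for 1000 simulated values of Q
mean, var, coverage = summarize_fit_statistics(stand_in)
print(mean, var, coverage)
```

If the test statistic is robust, the mean should be near 6, the variance near 12, and each percentage near its nominal level.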

In summary, the model in (1) with the errors-in-variables
parameterization can formulate the multipopulation analysis in a meaningful fashion. The
corresponding statistical analysis under the pseudo-normal-independence
model gives a simple and correct way to conduct statistical inferences about
the parameter vector $\tau$ without specifying a distributional form or
dependency structure over populations. In practice, $\tau$ contains all the parameters
of direct interest. The asymptotic covariance matrix and standard errors can
be readily computed using existing procedures, and provide a good
approximation in moderately sized samples. The proposed a.s.e.'s have smaller
variability than the variability of the robust sandwich estimator, provide
high efficiency relative to the full-likelihood method and can be used for
hypothesis testing with robust power. For instance, in the simulation study
for one of the most important parameters, $\delta_1$, in case A with fixed $\xi^{(i)}$, the
variability ratio is 1.65 (see Table 1), the efficiency ratio is 1.00 (see Table 1)
and the power of the test $H_0: \delta_1 = 0$ versus $H_1: \delta_1 < 0$ is 0.967. That is, if
the standard deviation of our proposed a.s.e. for $\hat{\delta}_1$ is 1, then the standard
deviation of the a.s.e. for $\hat{\delta}_1$ computed by the robust sandwich estimator is
1.65. Also, our proposed a.s.e. for $\hat{\delta}_1$ is close enough to the a.s.e. for $\hat{\delta}_1$ using
the full likelihood, and the power of the test is very high, 0.967, and very
close to the expected power, 0.960.

Table 2
Monte Carlo mean, variance and percentiles for the chi-square test statistics with 6
degrees of freedom

             Mean    Variance   10%     25%     50%     75%     90%     95%     99%
Expected     6       12         10      25      50      75      90      95      99
Simulated    6.0     11.7       9.2     23.6    49.7    75.9    90.5    96.3    98.9

5. Application. An application for model (1), estimated by minimizing 
(4), and for Theorem 1 is presented by analyzing a data set from the Bank 
of Greece with annual statements for the period 1999-2003. We examine the 
relationship between asset risk and capital in the Greek banking sector. As 
capital, we use total capital over total bank assets (capital-to-asset ratio). 
The variable for total capital includes core capital (tier I) plus supplemen- 
tary capital (tier II) minus deductions such as holdings of capital of other 
credit and financial institutions. As measures for asset risk, we use the two 
main components of risk-weighted assets which reflect credit and market 
risk. There is a two-way direction effect between capital and asset risk, and 
these relationships can be analyzed in a multivariate setting with simultane- 
ous equations; see [7] for the life insurance industry. Baranoff, Papadopoulos 
and Sager [6] compared the effect of two measures for asset risk to capital 
structure by approaching latent variables for the risk factors via a dynamic 
structural equation model, and they pointed out the differences between 
large and small companies. They fitted latent variable models on a balanced 
data set concentrating on companies for which data for all years are avail- 
able. In such balanced cases we ignore companies that have been bankrupt 
or have been merged with other companies, and new companies that started 
after the first year. In many studies, researchers are interested in examining 
such companies and fit latent variables, such as macroeconomic and risk fac- 
tors or measurement errors, in a highly unbalanced data set. Papadopoulos 
and Amemiya [26] discussed the disadvantages of the existing methods for 
unbalanced data. The methodology proposed in this paper is appropriate for
highly correlated, nonnormal and unbalanced data. Also, Theorem 1 ensures
robust asymptotic standard errors and overall-fit measures.

In this paper we analyze first differences of the logarithmic (In) transfor- 
mation, which actually approximate percentage changes, in order to avoid 
spurious regression, nonstationarity and cointegration to some extent. The 
explicit form of the model is 
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(31) 

t = 2000,...,2003,j = l,2,...,nW, 
. , /market risk \ w ^ ^ 

The above model is a confirmatory factor analytic model with one underlying
factor, $\xi_j^{(t)}$, that explains the relationships of the three observed variables,
and it is a simple case of model (1). The parameter $\beta_1$ is fixed equal to 1,
for identification reasons, and this actually assigns the latent factor, $\xi_j^{(t)}$,
to have the same units as the corresponding observed variable. The
variables $\xi_j^{(t)}$, $\varepsilon_{2j}^{(t)}$ and $\varepsilon_{3j}^{(t)}$ are assumed to follow nonnormal distributions, since
the observed variables have long tails, which is very common for financial
variables. These variables also have unrestricted variances over time due to
the heteroskedasticity over time of the observed variables. By viewing $\varepsilon_{1j}^{(t)}$
as measurement error, then as a smooth and invariant latent variable over
time it is assumed to follow a normal distribution with equal variances over
time. Also, we assume that the autocorrelation of the observed variables is
explained by the autocorrelation of $\xi_j^{(t)}$ and that the errors $\varepsilon_{kj}^{(t)}$, $k = 1, 2, 3$,
are independent over time, which is a common assumption when we analyze
differences and applications in this analysis. In general, if there is still
autocorrelation after taking the first differences, we can try second differences,
and so on.
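
The transformation applied to the observed variables, first differences of natural logarithms, can be sketched as follows. The series of levels is made-up illustrative data, not taken from the Bank of Greece data set, and the function name `log_first_differences` is hypothetical. Five annual levels (1999 to 2003) yield four differences (2000 to 2003), matching the range of $t$ in model (31).

```python
import numpy as np

def log_first_differences(levels):
    """First differences of the natural logarithm of a positive series."""
    levels = np.asarray(levels, dtype=float)
    return np.diff(np.log(levels))

# Hypothetical capital-to-asset ratios of one bank for 1999, ..., 2003.
ratio = np.array([0.100, 0.104, 0.101, 0.107, 0.110])
dlog = log_first_differences(ratio)        # values for t = 2000, ..., 2003
pct = ratio[1:] / ratio[:-1] - 1.0         # exact proportional changes
print(dlog)
print(pct)
```

For small changes the two sets of numbers are nearly identical, which is the sense in which the log differences approximate percentage changes.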

Frequently, in finance and banking we are interested in examining the re- 
lationship between asset risk and capital ratio, particularly when the asset 
risk increases or decreases significantly. In these cases the restricted vari- 
ables of asset risk have truncated distributions, in addition to their long 
tails, and the issue of robustness of standard methods to such nonnormal 
data becomes very important and necessary. Especially in the cases with 
restricted variables, the already unbalanced data lose the appearance of the 
banks in consecutive years, since they do not satisfy the required condition 
every year. Therefore, it is very difficult and in many, if not all, applications 
it is impossible to model the time series structure. Then methodologies that 
focus on modeling relationships between variables within the occasions, such 
as the proposed model in (1), become very attractive and useful. 

Table 3 shows results for model (31) using the proposed methodology for
all data and for data arising by restricting one of the observed variables.
For more details, see the explanation in Table 3. Table 4 shows the explicit
pattern of missing values for the case with market asset risk less than -0.05.
Thus, if we try to reformulate the four correlated samples as independent
samples based on the missing pattern of the banks, then we end up with
11 independent samples that have very small sample sizes (at most four),
most of them having just one observation. Therefore, the analysis
of balanced data is not possible since there is only one bank that appears in
all four years that satisfies the required restriction. Also, the analysis of time
series structure is not possible, since all samples that have banks appearing
in any two or more consecutive years have sample sizes of at most three. The
methodology suggested in this paper can be applied to four correlated
samples with observations from the four years, respectively. The sample sizes for
the four years are 5, 11, 11 and 14 from 2000, 2001, 2002 and 2003,
respectively, and the sum of the four samples is 41 (see the last row in Table 4).
According to our methodology, we analyze 41 observations from banks that
appear in at least one year. In this case, there are 18 different banks that
appear in some of the four years. It should be noted that the estimated
parameters of interest, $\beta_2$ and $\beta_3$, belong to the vector $\tau$ and thus, according
to Theorem 1(i), their asymptotic standard errors can be computed by the
covariance matrix $V_G^{(\hat{\tau})}$. The computation of $V_G^{(\hat{\tau})}$ involves moments only
of first and second order, and this issue is very important especially when
the sample size, as in this example, is small. Only the asymptotic covariance
matrix $V_G^{(\mathrm{vec}(\hat{\Sigma}_{\xi^{(i)}}))}$, defined in (8), requires fourth-order moments for
its computation, and for its use we need larger sample sizes than the
sample sizes of this example. Thus, we can fit panel data models of moderate
sample sizes relative to the number of estimated parameters and make
statistical inference for the most important parameters without using moments
of order higher than two in the analysis.

Table 3
Results for the coefficients $\beta_k$, k = 1, 2, 3, of model (31) for several cases: for all available
data (column 2) and for data that arise by restricting one of the observed variables to be
significantly positive (> 0.05) (columns 3, 5 and 7) or negative (< -0.05) (columns 4,
6 and 8)*

             Without        Restrictions on           Restrictions on         Restrictions on
             restrictions   capital-to-asset ratio    credit risk ratio       market risk ratio
             All            > 0.05      < -0.05       > 0.05      < -0.05     > 0.05      < -0.05
             n = 68         n = 23      n = 39        n = 37      n = 18      n = 26      n = 41

$\beta_1$    0.96           0.53        0.47          0.61        1.00        0.82        1.00
             1.00           1.00        1.00          1.00        1.00        1.00        1.00
             (-)            (-)         (-)           (-)         (-)         (-)         (-)

$\beta_2$    0.45           0.36        0.57          0.16        -0.03       0.54        0.58
             0.46           0.68        1.21          0.25        -0.03       0.66        0.58
             (1.95)         (1.58)      (0.43)        (0.94)      (-0.13)     (2.18)      (4.57)

$\beta_3$    0.48           1.00        0.16          1.00        0.54        0.73        0.51
             0.50           1.88        0.34          1.64        0.54        0.89        0.51
             (1.98)         (3.00)      (0.56)        (4.69)      (2.74)      (2.37)      (3.76)

*For each cell we report the standardized (first row; see [10] for a definition) and the
unstandardized (second row) coefficients, and the value of the z test [unstandardized
coefficient over its asymptotic standard error (a.s.e.)]. The sum of the sample sizes for the
four years, $n^{(2000)} + n^{(2001)} + n^{(2002)} + n^{(2003)}$, appears in the third row for each case.

Table 4
Pattern of missing data for the case with differences of the ln's for market risk
ratio < -0.05*

Group                    Number of banks   2000   2001   2002   2003
1                        2                                       2
2                        1                                1
3                        2                                2      2
4                        4                         4             4
5                        1                         1      1
6                        3                         3      3      3
7                        1                  1             1
8                        1                  1             1      1
9                        1                  1      1             1
10                       1                  1      1      1
11                       1                  1      1      1      1
Total number of banks    18                 5      11     11     14

*In the last four columns the nonzero numbers indicate that for the corresponding
group (numbered in column 1) the number of banks stated in column 2 appears in
these particular years labelled in row 1. The nonzero numbers in columns 2-6 are
the same in each row.
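
The contrast between grouping banks by missing-data pattern and grouping observations by year can be reproduced from the pattern in Table 4. The sketch below encodes each group as its number of banks and the years in which those banks appear; the names `PATTERNS`, `sample_sizes_by_year` and `has_consecutive_years` are hypothetical.

```python
# Each entry: (number of banks, years in which these banks appear), as in Table 4.
PATTERNS = [
    (2, (2003,)),
    (1, (2002,)),
    (2, (2002, 2003)),
    (4, (2001, 2003)),
    (1, (2001, 2002)),
    (3, (2001, 2002, 2003)),
    (1, (2000, 2002)),
    (1, (2000, 2002, 2003)),
    (1, (2000, 2001, 2003)),
    (1, (2000, 2001, 2002)),
    (1, (2000, 2001, 2002, 2003)),
]
YEARS = (2000, 2001, 2002, 2003)

def sample_sizes_by_year(patterns, years=YEARS):
    """Sample size of each yearly (correlated) sample."""
    return {y: sum(n for n, present in patterns if y in present) for y in years}

def has_consecutive_years(present):
    """True if the pattern contains at least two consecutive years."""
    return any(b - a == 1 for a, b in zip(present, present[1:]))

sizes = sample_sizes_by_year(PATTERNS)
n_banks = sum(n for n, _ in PATTERNS)               # distinct banks
n_obs = sum(sizes.values())                         # observations analyzed
balanced = sum(n for n, p in PATTERNS if len(p) == len(YEARS))
largest_consecutive = max(n for n, p in PATTERNS if has_consecutive_years(p))
print(sizes, n_banks, n_obs, balanced, largest_consecutive)
```

Grouping by pattern gives 11 tiny samples, whereas grouping by year gives four samples whose sizes add up to the 41 observations used in the analysis.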

Also, in the case with all banks (with no restriction on any observed
variable), there are 20 different banks that provide data for some of the four
years, creating a very unbalanced data set with only 12 banks appearing in
all four years. As Table 3 shows, in this case, adding up the banks that
appear in each year gives a total of 68 observations from the 20 banks.
These 68 observations were analyzed as four correlated samples,
giving consistent estimates, and correct asymptotic standard errors that are
efficient relative to the sandwich estimator, despite the nonnormality and
autocorrelation of the variables, according to Theorem 1.

The standardized coefficients in Table 3, in the case without restrictions
on the observed variables (column 2), indicate that the latent factor, $\xi_j^{(t)}$, is
strongly associated with the capital-to-asset ratio, 0.96, and has almost the
same degree of correlation with the two measures for asset risk, 0.45 and
0.48. The results change considerably when we restrict one of the observed
variables to significantly positive or negative values. When we restrict the
capital-to-asset ratio to positive values, the factor $\xi_j^{(t)}$ coincides with market risk,
and gives a stronger and significant correlation with capital-to-asset ratio
than the one with credit risk. The results found by restricting capital-to-asset
ratio to negative values are not statistically significant. When we restrict the
credit risk ratio to positive and to negative values, the factor $\xi_j^{(t)}$ coincides
with market risk and capital-to-asset ratio, respectively, and is significantly
correlated with capital-to-asset ratio and market risk, respectively, 0.61 and
0.54, and not with the other variable. Comparing the results from the last
two columns to the results of column 2, we observe that the standardized
coefficients for $\beta_2$ and $\beta_3$ are higher in these columns than the ones in
column 2. Also note that in column 7 the market risk gives a much higher
standardized coefficient, 0.73, than the credit risk, 0.54, and indicates the
strongest relationship between capital-to-asset ratio and asset risk. All in all,
as expected, the capital-to-asset ratio is always positively correlated to both
credit and market asset risk. Also, the results change when we restrict one of
the observed variables to be positive or negative, and thus analyzing such
restricted cases is worthwhile.
Even though the panel data are highly unbalanced and additionally lose their
consecutive appearance over the years, our methodology can be applied and
can provide correct statistical inference.
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